Analysis of Translational Correspondence in view of Sub-sentential Alignment
Abstract
This paper reports the first results of an empirical study of translational correspondence across different text types for the English-Dutch language pair. A Gold Standard was created that can serve as a standard data set for evaluating sub-sentential alignment. The manually indicated translational correspondences were analyzed in view of the different heuristics used in existing sub-sentential alignment modules.
Similar resources
Sub-Sentential Alignment Method by Analogy
This paper describes a method for finding word correspondences between pairs of translation sentences. In Example-Based Machine Translation, translation patterns can be extracted easily if word correspondences between pairs of translation sentences are defined. The popular methods for aligning a bilingual corpus at a sub-sentential level are unable to produce reliable results when the size of...
Using Punctuations and Lengths for Bilingual Sub-sentential Alignment
We present a new approach to aligning bilingual English and Chinese text at the sub-sentential level by interleaving matches of alphabetic text and punctuation. With sub-sentential alignment, we expect to improve the effectiveness of alignment at the word, chunk, and phrase levels and to provide finer-grained and more reusable translation memory.
Aligning linguistically motivated phrases
In this paper, we describe the architecture of a sub-sentential alignment system that links linguistically motivated phrases in parallel texts. We conceive our sub-sentential aligner as a cascade model consisting of two phases. In the first phase, anchor chunks are linked on the basis of lexical correspondences and syntactic similarity. In the second phase, we will focus on the more complex tra...
Interleaving Text and Punctuations for Bilingual Sub-sentential Alignment
We present a new approach to aligning bilingual English and Chinese text at the sub-sentential level by interleaving matches of alphabetic text and punctuation. With sub-sentential alignment, we expect to improve the effectiveness of alignment at the word, chunk, and phrase levels and to provide finer-grained and more reusable translation memory.
Hierarchical Sub-sentential Alignment with Anymalign
We present a sub-sentential alignment algorithm that relies on association scores between words or phrases. This algorithm is inspired by previous work on alignment by recursive binary segmentation and on document clustering. We evaluate the resulting alignments on machine translation tasks and show that we can obtain state-of-the-art results, with gains up to more than 4 BLEU points compared to...